Average profiles, from tries to suffix-trees
Abstract
We build upon previous work of Fayolle (2004) and Park and Szpankowski (2005) to study the asymptotic average internal profile of tries and of suffix-trees. The binary keys and the strings are generated by a Bernoulli source (p, q). We consider the average number $p_{k,P}(\nu)$ of internal nodes at depth $k$ of a trie whose number of input keys follows a Poisson law of parameter $\nu$. The Mellin transform of the corresponding bivariate generating function has a major singularity at the origin, which implies a phase reversal for the saturation rate $p_{k,P}(\nu)/2^k$ as $k$ reaches the value $2\log(\nu)/(\log(1/p) + \log(1/q))$. We prove that the asymptotic average profiles of random tries and suffix-trees are mostly similar, up to second-order terms, a fact that was observed experimentally in Nicodème (2003); the proof follows from comparisons to the profile of tries in the Poisson model.
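The quantities named in the abstract can be explored numerically with a minimal sketch. It is not taken from the paper: it assumes the standard Poissonized occupancy argument, in which a prefix with j symbols of probability p and k - j symbols of probability q is shared by a Poisson number of keys with mean x = ν p^j q^(k-j), and is an internal node exactly when at least two keys share it, which happens with probability 1 - (1 + x) e^(-x). The function names below are hypothetical.

```python
import math


def poisson_internal_profile(k, nu, p):
    """Average number of internal nodes at depth k of a trie built on a
    Poisson(nu) number of keys from a Bernoulli (p, 1 - p) source.

    Assumption (not quoted from the abstract): a depth-k prefix is an
    internal node iff at least two keys share it.
    """
    q = 1.0 - p
    total = 0.0
    for j in range(k + 1):
        # Mean number of keys sharing a prefix with j p-symbols and k - j q-symbols.
        x = nu * p**j * q**(k - j)
        # P(at least two keys) = 1 - (1 + x) e^{-x}, written to limit cancellation.
        prob = max(0.0, -math.expm1(-x) - x * math.exp(-x))
        total += math.comb(k, j) * prob
    return total


def saturation_rate(k, nu, p):
    """Fraction of the 2^k possible depth-k nodes that are internal on average."""
    return poisson_internal_profile(k, nu, p) / 2**k


def phase_threshold(nu, p):
    """Depth 2 log(nu) / (log(1/p) + log(1/q)) stated in the abstract."""
    q = 1.0 - p
    return 2.0 * math.log(nu) / (math.log(1.0 / p) + math.log(1.0 / q))


if __name__ == "__main__":
    nu, p = 10_000.0, 0.3
    print("threshold depth:", phase_threshold(nu, p))
    for k in range(0, 31, 5):
        print(k, saturation_rate(k, nu, p))
```

Running the script prints the threshold depth and the saturation rate at a few depths, so the decay from nearly full levels to nearly empty ones can be compared with the threshold value; the sketch makes no claim about the second-order terms analyzed in the paper.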
Similar resources
The Average Profile of Suffix Trees
The internal profile of a tree structure denotes the number of internal nodes found at a specific level of the tree. Similarly, the external profile denotes the number of leaves on a level. The profile is of great interest because of its intimate connection to many other parameters of trees. For instance, the depth, fill-up level, height, path length, shortest path, and size of trees can each b...
Compact Suffix Trees Resemble PATRICIA Tries: Limiting Distribution of the Depth
Suffix trees are the most frequently used data structures in algorithms on words. In this paper, we consider the depth of a compact suffix tree, also known as the PAT tree, under some simple probabilistic assumptions. For a biased memoryless source, we prove that the limiting distribution for the depth in a PAT tree is the same as the limiting distribution for the depth in a PATRICIA trie, even...
Analysis of the average depth in a suffix tree under a Markov model
In this report, we prove that under a Markovian model of order one, the average depth of suffix trees of index n is asymptotically similar to the average depth of tries (a.k.a. digital trees) built on n independent strings. This leads to an asymptotic behavior of (log n)/h + C for the average depth of the suffix tree, where h is the entropy of the Markov model and C is a constant. Our proo...
On the Number of 2-Protected Nodes in Tries and Suffix Trees
We use probabilistic and combinatorial tools on strings to discover the average number of 2-protected nodes in tries and in suffix trees. Our analysis covers both the uniform and non-uniform cases. For instance, in a uniform trie with n leaves, the number of 2-protected nodes is approximately 0.803n, plus small first-order fluctuations. The 2-protected nodes are an emerging way to distinguish t...
Suffix Trees and Simple Sources
Using an intricate method, Jacquet and Szpankowski [2] compared the depth of insertion into suffix-trees and tries in the non-uniform Bernoulli model, as well as the average size of suffix-trees and tries under the same model. They proved that the depth of insertion has asymptotically the same probabilistic behaviour in both cases, and that the average sizes of a trie and a suffix-tree built wi...